932 results for deviance information criteria, model averaging, MCMC, genomewide association studies, epistasis, logistic regression, stochastic search algorithm, case-control studies, Type I diabetes, single nucleotide polymorphism, gene expression programming


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Genetic research of complex diseases is a challenging, but exciting, area of research. The early development of the research was limited, however, until the completion of the Human Genome and HapMap projects, along with the reduction in the cost of genotyping, which paved the way for understanding the genetic composition of complex diseases. In this thesis, we focus on the statistical methods for two aspects of genetic research: phenotype definition for diseases with complex etiology and methods for identifying potentially associated Single Nucleotide Polymorphisms (SNPs) and SNP-SNP interactions. With regard to phenotype definition for diseases with complex etiology, we first investigated the effects of different statistical phenotyping approaches on the subsequent analysis. In light of the findings, and the difficulties in validating the estimated phenotype, we proposed two different methods for reconciling phenotypes of different models using Bayesian model averaging as a coherent mechanism for accounting for model uncertainty. In the second part of the thesis, the focus turns to the methods for identifying associated SNPs and SNP interactions. We review the use of Bayesian logistic regression with variable selection for SNP identification and extend the model to detect interaction effects in population-based case-control studies. In this part of the study, we also develop a machine learning algorithm to cope with large-scale data analysis, namely modified Logic Regression with Genetic Program (MLR-GEP), which is then compared with the Bayesian model, Random Forests and other variants of logic regression.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In population studies, most current methods focus on identifying one outcome-related SNP at a time by testing for differences of genotype frequencies between disease and healthy groups or among different population groups. However, testing a great number of SNPs simultaneously raises the multiple-testing problem and can give false-positive results. Although this problem can be effectively dealt with through several approaches such as Bonferroni correction, permutation testing and false discovery rates, patterns of the joint effects of several genes, each with a weak effect, might not be detected. With the availability of high-throughput genotyping technology, searching for multiple scattered SNPs over the whole genome and modeling their joint effect on the target variable has become possible. Exhaustive search of all SNP subsets is computationally infeasible for millions of SNPs in a genome-wide study. Several effective feature selection methods combined with classification functions have been proposed to search for an optimal SNP subset among big data sets where the number of feature SNPs far exceeds the number of observations. In this study, we take two steps to achieve the goal. First we selected 1000 SNPs through an effective filter method and then we performed a feature selection wrapped around a classifier to identify an optimal SNP subset for predicting disease. We also developed a novel classification method, the sequential information bottleneck method, wrapped inside different search algorithms to identify an optimal subset of SNPs for classifying the outcome variable. This new method was compared with the classical linear discriminant analysis in terms of classification performance. Finally, we performed a chi-square test to look at the relationship between each SNP and disease from another point of view.
In general, our results show that filtering features using the harmonic mean of sensitivity and specificity (HMSS) through linear discriminant analysis (LDA) is better than using LDA training accuracy or mutual information in our study. Our results also demonstrate that exhaustive search of a small subset of one, two or three SNPs based on the best 100 composite 2-SNPs can find an optimal subset, and further inclusion of more SNPs through a heuristic algorithm doesn't always increase the performance of SNP subsets. Although sequential forward floating selection can be applied to prevent the nesting effect of forward selection, it does not always outperform the latter due to overfitting from observing more complex subset states. Our results also indicate that HMSS as a criterion to evaluate the classification ability of a function can be used in imbalanced data without modifying the original dataset, unlike classification accuracy. Our four studies suggest that Sequential Information Bottleneck (sIB), a new unsupervised technique, can be adopted to predict the outcome and that its ability to detect the target status is superior to the traditional LDA in the study. From our results we can see that the best test probability-HMSS for predicting CVD, stroke, CAD and psoriasis through sIB is 0.59406, 0.641815, 0.645315 and 0.678658, respectively. In terms of group prediction accuracy, the highest test accuracy of sIB for diagnosing a normal status among controls can reach 0.708999, 0.863216, 0.639918 and 0.850275 respectively in the four studies if the test accuracy among cases is required to be not less than 0.4. On the other hand, the highest test accuracy of sIB for diagnosing a disease among cases can reach 0.748644, 0.789916, 0.705701 and 0.749436 respectively in the four studies if the test accuracy among controls is required to be at least 0.4.
A further genome-wide association study through chi-square test shows that there are no significant SNPs detected at the cut-off level 9.09451E-08 in the Framingham heart study of CVD. Study results in WTCCC can only detect two significant SNPs that are associated with CAD. In the genome-wide study of psoriasis most of the top 20 SNP markers with impressive classification accuracy are also significantly associated with the disease through chi-square test at the cut-off value 1.11E-07. Although our classification methods can achieve high accuracy in the study, complete descriptions of those classification results (95% confidence interval or statistical test of differences) require more cost-effective methods or a more efficient computing system, neither of which can currently be accomplished in our genome-wide study. We should also note that the purpose of this study is to identify subsets of SNPs with high prediction ability, and those SNPs with good discriminant power are not necessarily causal markers for the disease.
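
The HMSS criterion named in the abstract above can be illustrated with a minimal sketch. The function names and the confusion-matrix counts below are hypothetical and are not taken from the study; the sketch only assumes the usual definition of a harmonic mean applied to sensitivity and specificity, and that both classes are present in the sample.

```python
def hmss(tp, fn, tn, fp):
    """Harmonic mean of sensitivity and specificity from confusion counts.

    Assumes at least one case (tp + fn > 0) and one control (tn + fp > 0).
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    if sensitivity + specificity == 0:
        return 0.0
    return 2 * sensitivity * specificity / (sensitivity + specificity)


def accuracy(tp, fn, tn, fp):
    """Plain classification accuracy, for comparison."""
    return (tp + tn) / (tp + fn + tn + fp)


# Hypothetical imbalanced sample: 10 cases and 90 controls.
# "trivial" labels everyone as a control; "balanced" is right 70% of the time in each group.
trivial = dict(tp=0, fn=10, tn=90, fp=0)
balanced = dict(tp=7, fn=3, tn=63, fp=27)

for name, counts in (("trivial", trivial), ("balanced", balanced)):
    print(name, "accuracy:", accuracy(**counts), "HMSS:", hmss(**counts))
```

In this made-up example the trivial classifier scores higher on accuracy but zero on HMSS, which is one way to read the abstract's remark that HMSS can be used on imbalanced data without modifying the dataset.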

Relevance:

100.00% 100.00%

Publisher:

Abstract:

KLK15 over-expression is reported to be a significant predictor of reduced progression-free survival and overall survival in ovarian cancer. Our aim was to analyse the KLK15 gene for putative functional single nucleotide polymorphisms (SNPs) and assess the association of these and KLK15 HapMap tag SNPs with ovarian cancer survival. Results: In silico analysis was performed to identify KLK15 regulatory elements and to classify potentially functional SNPs in these regions. After SNP validation and identification by DNA sequencing of ovarian cancer cell lines and aggressive ovarian cancer patients, 9 SNPs were shortlisted and genotyped using the Sequenom iPLEX Mass Array platform in a cohort of Australian ovarian cancer patients (N = 319). In the Australian dataset we observed significantly worse survival for the KLK15 rs266851 SNP in a dominant model (Hazard Ratio (HR) 1.42, 95% CI 1.02-1.96). This association was observed in the same direction in two independent datasets, with a combined HR for the three studies of 1.16 (1.00-1.34). This SNP lies 15 bp downstream of a novel exon and is predicted to be involved in mRNA splicing. The mutant allele is also predicted to abrogate an HSF-2 binding site. Conclusions: We provide evidence of association for the SNP rs266851 with ovarian cancer survival. Our results provide the impetus for downstream functional assays and additional independent validation studies to assess the role of KLK15 regulatory SNPs and KLK15 isoforms with alternative intracellular functional roles in ovarian cancer survival.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

A recent genome-wide association study reported association between schizophrenia and the ZNF804A gene on chromosome 2q32.1. We attempted to replicate these findings in our Irish Case-Control Study of Schizophrenia (ICCSS) sample (N=1021 cases, 626 controls). Following consultation with the original investigators, we genotyped three of the most promising single-nucleotide polymorphisms (SNPs) from the Cardiff study. We replicate association with rs1344706 (trend test one-tailed P=0.0113 with the previously associated A allele) in ZNF804A. We detect no evidence of association with rs6490121 in NOS1 (one-tailed P=0.21), and only a trend with rs9922369 in RPGRIP1L (one-tailed P=0.0515). On the basis of these results, we completed genotyping of 11 additional linkage disequilibrium-tagging SNPs in ZNF804A. Of 12 SNPs genotyped, 11 pass quality control criteria and 4 are nominally associated, with our most significant evidence of association at rs7597593 (P=0.0013) followed by rs1344706. We observe no evidence of differential association in ZNF804A on the basis of family history or sex of case. The associated SNP rs1344706 lies in approximately 30 bp of conserved mammalian sequence, and the associated A allele is predicted to maintain binding sites for the brain-expressed transcription factors MYT1L and POU3F1/OCT-6. In controls, expression is significantly increased from the A allele of rs1344706 compared with the C allele. Expression is increased in schizophrenic cases compared with controls, but this difference does not achieve statistical significance. This study replicates the original reported association of ZNF804A with schizophrenia and suggests that there is a consistent link between the A allele of rs1344706, increased expression of ZNF804A and risk for schizophrenia.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

With hundreds of single nucleotide polymorphisms (SNPs) in a candidate gene and millions of SNPs across the genome, selecting an informative subset of SNPs to maximize the ability to detect genotype-phenotype association is of great interest and importance. In addition, with a large number of SNPs, analytic methods are needed that allow investigators to control the false positive rate resulting from large numbers of SNP genotype-phenotype analyses. This dissertation uses simulated data to explore methods for selecting SNPs for genotype-phenotype association studies. I examined the pattern of linkage disequilibrium (LD) across a candidate gene region and used this pattern to aid in localizing a disease-influencing mutation. The results indicate that the r² measure of linkage disequilibrium is preferred over the common D′ measure for use in genotype-phenotype association studies. Using step-wise linear regression, the best predictor of the quantitative trait was not usually the single functional mutation. Rather, it was a SNP that was in high linkage disequilibrium with the functional mutation. Next, I compared three strategies for selecting SNPs for application to phenotype association studies: based on measures of linkage disequilibrium, based on a measure of haplotype diversity, and random selection. The results demonstrate that SNPs selected based on maximum haplotype diversity are more informative and yield higher power than randomly selected SNPs or SNPs selected based on low pair-wise LD. The data also indicate that for genes with small contribution to the phenotype, it is more prudent for investigators to increase their sample size than to continuously increase the number of SNPs in order to improve statistical power. When typing large numbers of SNPs, researchers are faced with the challenge of utilizing an appropriate statistical method that controls the type I error rate while maintaining adequate power.
We show that an empirical genotype-based multi-locus global test that uses permutation testing to investigate the null distribution of the maximum test statistic maintains a desired overall type I error rate while not overly sacrificing statistical power. The results also show that when the penetrance model is simple the multi-locus global test does as well as or better than the haplotype analysis. However, for more complex models, haplotype analyses offer advantages. The results of this dissertation will be of utility to human geneticists designing large-scale multi-locus genotype-phenotype association studies.
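
The two linkage disequilibrium measures compared in the abstract above, D′ and r², can be computed from haplotype and allele frequencies with the standard textbook formulas. The sketch below is illustrative only: the function name and the example frequencies are hypothetical and do not come from the dissertation, and the code assumes two biallelic loci with allele frequencies strictly between 0 and 1.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B (each strictly between 0 and 1)
    """
    d = p_ab - p_a * p_b
    # D' rescales |D| by the largest value it could take given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    # r^2 is the squared correlation between the two loci.
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


# Hypothetical frequencies: a rare allele B (10%) found only on haplotypes
# that also carry the common allele A (50%).
d, d_prime, r2 = ld_measures(p_ab=0.10, p_a=0.50, p_b=0.10)
print(f"D = {d:.3f}, D' = {d_prime:.3f}, r^2 = {r2:.3f}")
```

With these made-up frequencies D′ is at its maximum while r² stays low, so the two measures can disagree sharply about how well one SNP stands in for another.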

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Endometrial cancer is one of the most common female diseases in developed nations and is the most commonly diagnosed gynaecological cancer in Australia. The disease is commonly classified by histology: endometrioid or non-endometrioid endometrial cancer. While non-endometrioid endometrial cancers are accepted to be high-grade, aggressive cancers, endometrioid cancers (comprising 80% of all endometrial cancers diagnosed) generally carry a favourable patient prognosis. However, endometrioid endometrial cancer patients endure significant morbidity due to surgery and radiotherapy used for disease treatment, and patients with recurrent disease have a 5-year survival rate of less than 50%. Genetic analysis of women with endometrial cancer could uncover novel markers associated with disease risk and/or prognosis, which could then be used to identify women at high risk and to guide the use of specialised treatments. Proteases are widely accepted to play an important role in the development and progression of cancer. This PhD project hypothesised that SNPs from two protease gene families, the matrix metalloproteases (MMPs, including their tissue inhibitors, TIMPs) and the tissue kallikrein-related peptidases (KLKs), would be associated with endometrial cancer susceptibility and/or prognosis. In the first part of this study, optimisation of the genotyping techniques was performed. Validation of results from previously published endometrial cancer genetic association studies was attempted in a large, multicentre replication set (maximum cases n = 2,888, controls n = 4,483, 3 studies). The rs11224561 progesterone receptor SNP (PGR, A/G) was observed to be associated with increased endometrial cancer risk (per A allele OR 1.31, 95% CI 1.12-1.53; p-trend = 0.001), a result which was initially reported among a Chinese sample set.
Previously reported associations for the remaining 8 SNPs investigated for this section of the PhD study were not confirmed, thereby reinforcing the importance of validation of genetic association studies. To examine the effect of SNPs from the MMP and KLK families on endometrial cancer risk, we selected the most significantly associated MMP and KLK SNPs from genome-wide association study analysis (GWAS) to be genotyped in the GWAS replication set (cases n = 4,725, controls n = 9,803, 13 studies). The significance of the MMP24 rs932562 SNP was unchanged after incorporation of the stage 2 samples (Stage 1 per allele OR 1.18, p = 0.002; Combined Stage 1 and 2 OR 1.09, p = 0.002). The rs10426 SNP, located 3' to KLK10, was predicted by bioinformatic analysis to affect miRNA binding. This SNP was observed in the GWAS stage 1 result to exhibit a recessive effect on endometrial cancer risk, a result which was not validated in the stage 2 sample set (Stage 1 OR 1.44, p = 0.007; Combined Stage 1 and 2 OR 1.14, p = 0.08). Investigation of the regions imputed surrounding the MMP, TIMP and KLK genes did not reveal any significant targets for further analysis. Analysis of the case data from the endometrial cancer GWAS to identify genetic variation associated with cancer grade did not reveal SNPs from the MMP, TIMP or KLK genes to be statistically significant. However, the representation of SNPs from the MMP, TIMP and KLK families by the GWAS genotyping platform used in this PhD project was examined and observed to be very low, with the genetic variation of four genes (MMP23A, MMP23B, MMP28 and TIMP1) not captured at all by this technique. This suggests that comprehensive candidate gene association studies will be required to assess the role of SNPs from these genes with endometrial cancer risk and prognosis.
Meta-analysis of gene expression microarray datasets curated as part of this PhD study identified a number of MMP, TIMP and KLK genes to display differential expression by endometrial cancer status (MMP2, MMP10, MMP11, MMP13, MMP19, MMP25 and KLK1) and histology (MMP2, MMP11, MMP12, MMP26, MMP28, TIMP2, TIMP3, KLK6, KLK7, KLK11 and KLK12). In light of these findings these genes should be prioritised for future targeted genetic association studies. Two SNPs located 43.5 Mb apart on chromosome 15 were observed from the GWAS analysis to be associated with increased endometrial cancer grade, results that were validated in silico in two independent datasets. One of these SNPs, rs8035725, is located in the 5' untranslated region of a MYC promoter binding protein DENND4A (Stage 1 OR 1.15, p = 9.85 x 10^-5; combined Stage 1 and in silico validation OR 1.13, p = 5.24 x 10^-6). This SNP has previously been reported to alter the expression of PTPLAD1, a gene involved in the synthesis of very long fatty acid chains and in the Rac1 signaling pathway. Meta-analysis of gene expression microarray data found PTPLAD1 to display increased expression in the aggressive non-endometrioid histology compared with endometrioid endometrial cancer, suggesting that the causal SNP underlying the observed genetic association may influence expression of this gene. Neither rs8035725 nor significant SNPs identified by imputation were predicted bioinformatically to affect transcription factor binding sites, indicating that further studies are required to assess their potential effect on other regulatory elements. The other grade-associated SNP, rs6606792, is located upstream of an inferred pseudogene, ELMO2P1 (Stage 1 OR 1.12, p = 5 x 10^-5; combined Stage 1 and in silico validation OR 1.09, p = 3.56 x 10^-5).
Imputation of the ±1 Mb region surrounding this SNP revealed a cluster of significantly associated variants which are predicted to abolish various transcription factor binding sites, and would be expected to decrease gene expression. ELMO2P1 was not included on the microarray platforms collected for this PhD, and so its expression could not be investigated. However, the high sequence homology of ELMO2P1 with ELMO2, a gene important to cell motility, indicates that ELMO2 could be the parent gene for ELMO2P1 and, as such, ELMO2P1 could function to regulate the expression of ELMO2. Increased expression of ELMO2 was seen to be associated with increasing endometrial cancer grade, as well as with aggressive endometrial cancer histological subtypes by microarray meta-analysis. Thus, it is hypothesised that SNPs in linkage disequilibrium with rs6606792 decrease the transcription of ELMO2P1, reducing the regulatory effect of ELMO2P1 on ELMO2 expression. Consequently, ELMO2 expression is increased and cell motility is enhanced, leading to an aggressive endometrial cancer phenotype. In summary, these findings have identified several areas of research for further study. The results presented in this thesis provide evidence that a SNP in PGR is associated with risk of developing endometrial cancer. This PhD study also reports two independent loci on chromosome 15 to be associated with increased endometrial cancer grade, and furthermore, genes associated with these SNPs to be differentially expressed in aggressive subtypes and/or by grade. The studies reported in this thesis support the need for comprehensive SNP association studies on prioritised MMP, TIMP and KLK genes in large sample sets. Until these studies are performed, the role of MMP, TIMP and KLK genetic variation remains unclear. Overall, this PhD study has contributed to the understanding of genetic variation involvement in endometrial cancer susceptibility and prognosis.
Importantly, the genetic regions highlighted in this study could lead to the identification of novel gene targets to better understand the biology of endometrial cancer and also aid in the development of therapeutics directed at treating this disease.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Background: Breast cancer (BC) is primarily considered a genetic disorder with a complex interplay of factors including age, gender, ethnicity, family history, personal history and lifestyle with associated hormonal and non-hormonal risk factors. The SNP rs2910164 in miR146a (a G to C polymorphism) was previously associated with increased risk of BC in cases with at least a single copy of the C allele, though results in other cancers and populations have shown significant variation. Methods: In this study, we examined this SNP in an Australian sporadic breast cancer population of 160 cases and matched controls, with a replicate population of 403 breast cancer cases, using High Resolution Melting. Results: Our analysis indicated that the rs2910164 polymorphism is associated with breast cancer risk in both primary and replicate populations (p = 0.03 and 0.0013, respectively). In contrast to the results of familial breast cancer studies, however, we found that the presence of the G allele of rs2910164 is associated with increased cancer risk, with an OR of 1.77 (95% CI 1.40–2.23). Conclusions: The microRNA miR146a has a potential role in the development of breast cancer and the effects of its SNPs require further inquiry to determine the nature of their influence on breast tissue and cancer.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Diabetic nephropathy (DN) is the primary cause of morbidity and mortality in patients with type 1 diabetes mellitus (T1DM) and affects about 30% of these patients. We have previously localized a DN locus on chromosome 3q with suggestive linkage in Finnish individuals. Linkage to this region has also been reported earlier by several other groups. To fine map this locus, we conducted a multistage case-control association study in T1DM patients, comprising 1822 cases with nephropathy and 1874 T1DM patients free of nephropathy, from Finland, Iceland, and the British Isles. At the screening stage, we genotyped 3072 tag SNPs, spanning a 28 Mb region, in 234 patients and 215 controls from Finland. SNPs that met the significance threshold of p

Relevance:

100.00% 100.00%

Publisher:

Abstract:

AIMS/HYPOTHESIS:

Retinal vascular calibre changes may reflect early subclinical microvascular disease in diabetes. Because of the considerable homology between retinal and cerebral microcirculation, we examined whether retinal vascular calibre, as a proxy of cerebral microvascular disease, was associated with cognitive function in older people with type 2 diabetes.

METHODS:

A cross-sectional analysis of 954 people aged 60-75 years with type 2 diabetes from the population-based Edinburgh Type 2 Diabetes Study was performed. Participants underwent standard seven-field binocular digital retinal photography and a battery of seven cognitive function tests. The Mill Hill Vocabulary Scale was used to estimate pre-morbid cognitive ability. Retinal vascular calibre was measured from an image field with the optic disc in the centre using a validated computer-based program.

RESULTS:

After age and sex adjustment, larger retinal arteriolar and venular calibres were significantly associated with lower scores for the Wechsler Logical Memory test, with standardised regression coefficients -0.119 and -0.084, respectively (p < 0.01), but not with other cognitive tests. There was a significant interaction between sex and retinal vascular calibre for logical memory. In male participants, the association of increased retinal arteriolar calibre with logical memory persisted (p < 0.05) when further adjusted for vocabulary, venular calibre, depression, cardiovascular risk factors and macrovascular disease. In female participants, this association was weaker and not significant.

CONCLUSIONS/INTERPRETATION:

Retinal arteriolar dilatation was associated with poorer memory, independent of estimated prior cognitive ability in older men with type 2 diabetes. The sex interaction with stronger findings in men requires confirmation. Nevertheless, these data suggest that impaired cerebral arteriolar autoregulation in smooth muscle cells, leading to arteriolar dilatation, may be a possible pathogenic mechanism in verbal declarative memory decrements in people with diabetes.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper presents a control strategy for blood glucose (BG) level regulation in type 1 diabetic patients. To design the controller, a model-based predictive control scheme has been applied to a newly developed diabetic patient model. The controller is provided with a feedforward loop to improve meal compensation, a gain-scheduling scheme to account for different BG levels, and an asymmetric cost function to reduce hypoglycemic risk. A simulation environment that has been approved for testing of artificial pancreas control algorithms has been used to test the controller. The simulation results show good controller performance in fasting conditions and meal disturbance rejection, and robustness against model–patient mismatch and errors in meal estimation.
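
The abstract above does not give the form of its asymmetric cost function. As a minimal sketch of the general idea only, one common way to make a tracking cost asymmetric is to weight excursions below the glucose target more heavily than excursions above it. Everything in the block below (the function name, the 110 mg/dL target and both weights) is a hypothetical choice for illustration, not the paper's controller.

```python
def asymmetric_cost(predicted_bg, target=110.0, w_low=10.0, w_high=1.0):
    """Weighted sum of squared deviations of predicted BG from a target.

    Deviations below the target (towards hypoglycemia) are multiplied by
    w_low, deviations above it by w_high. All defaults are illustrative.
    """
    cost = 0.0
    for bg in predicted_bg:
        error = bg - target
        weight = w_low if error < 0 else w_high
        cost += weight * error * error
    return cost


# Two hypothetical predicted trajectories (mg/dL) that miss the target by
# the same amount in opposite directions.
too_low = [80.0, 80.0, 80.0]
too_high = [140.0, 140.0, 140.0]
print(asymmetric_cost(too_low), asymmetric_cost(too_high))
```

A predictive controller minimising a cost of this shape would prefer trajectories that err on the high side, which matches the stated goal of reducing hypoglycemic risk.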

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Solitary keratoacanthoma (KA) is a common benign epithelial tumor of the skin characterized by rapid growth and a tendency toward spontaneous regression. The exact etiology and classification of KA are a matter of debate. Smokers also seem to be more affected than persons who have never smoked. The objective of this study was to evaluate the association between solitary KA and smoking habit. A case-control study involving 78 patients diagnosed with KA and 199 controls from the related community was performed to evaluate the association between cigarette smoking and KA. A higher smoking prevalence was noted in cases (69.2%) than in controls (21.6%), and the odds ratio adjusted for sex and age was 9.1 (95% CI 4.9 to 17.1, p < 0.01). The mean tumoral diameter at surgery and the site of involvement were not statistically related to smoking. These findings suggest that cigarette smoking is associated with the development of KA. © 2006 Dermatology Online Journal.
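
For orientation, a crude (unadjusted) odds ratio with a Woolf-type 95% confidence interval can be computed from a 2×2 table. The counts used below are an assumption: they are back-calculated from the percentages and group sizes in the abstract above (69.2% of 78 cases is about 54 smokers, 21.6% of 199 controls is about 43) and are not reported by the authors. The result is not expected to match the published odds ratio of 9.1, which is adjusted for sex and age.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Crude odds ratio and Woolf (log-based) confidence interval.

    a: exposed cases, b: unexposed cases,
    c: exposed controls, d: unexposed controls (all must be > 0).
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Assumed counts, inferred from the reported percentages (not given in the abstract):
# 54 smokers / 24 non-smokers among cases, 43 smokers / 156 non-smokers among controls.
crude_or, lower, upper = odds_ratio_ci(54, 24, 43, 156)
print(f"crude OR = {crude_or:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

Under these assumed counts the crude estimate falls somewhat below the adjusted figure, the kind of gap that adjustment for sex and age can produce.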

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)